Exploiting Clustering Techniques for Web User-session Inference

Authors

  • A. Bianco
  • G. Mardente
  • M. Mellia
  • M. Munafò
  • L. Muscariello
Abstract

We focus on the definition and identification of “Web user-sessions”: aggregations of several TCP connections generated by the same source host, grouped on the basis of TCP connection opening time. Identifying a user session is nontrivial. Traditional approaches rely on threshold-based mechanisms, which are very sensitive to the threshold value, and that value may be difficult to set correctly. By applying clustering techniques, we define a novel methodology that identifies Web user-sessions without requiring an a priori definition of threshold values. We discuss the pros and cons of this approach and define a methodology to be applied to real traffic traces. The proposed methodology is evaluated on artificially generated traces to show its benefits over traditional threshold-based approaches. We then analyze the characteristics of user sessions extracted from real traces, studying the statistical properties of the identified sessions.
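The threshold sensitivity mentioned in the abstract can be illustrated with a minimal sketch of the traditional baseline. This is not the paper's clustering method; it only shows a plain threshold-based splitter. The function name `split_sessions`, the sample opening times, and the threshold values are all hypothetical choices made for illustration.

```python
from typing import List


def split_sessions(open_times: List[float], threshold: float) -> List[List[float]]:
    """Group TCP connection opening times (in seconds) from one source host.

    A new session starts whenever the gap to the previous connection
    exceeds `threshold`. This is the traditional baseline, not the
    clustering approach proposed in the paper.
    """
    sessions: List[List[float]] = []
    for t in sorted(open_times):
        if sessions and t - sessions[-1][-1] <= threshold:
            sessions[-1].append(t)  # gap small enough: same session
        else:
            sessions.append([t])    # first connection or large gap: new session
    return sessions


# Hypothetical opening times for a single host (seconds).
times = [0.0, 1.0, 2.5, 40.0, 41.0, 400.0, 402.0]

# The same trace yields a different number of sessions per threshold.
for threshold in (10.0, 60.0, 600.0):
    print(threshold, len(split_sessions(times, threshold)))
```

On this toy trace the session count changes with every threshold tried, which is the difficulty that motivates an approach with no a priori threshold.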


Similar resources

Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems

One of the best-known recommendation methods is user-based Collaborative Filtering (CF). This system compares the active user’s item ratings with the historical rating records of other users to find similar users, and recommends items that seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...


User Navigation Pattern Discovery using Fast Adaptive Neuro-Fuzzy Inference System

The World Wide Web is a huge repository of web pages and links. It provides abundant information for Internet users, and its growth remains remarkable. Users’ accesses are recorded in web logs. From the user’s perspective, it is very difficult to extract useful knowledge from the huge amount of information and secondly, it is also difficult to extract for the us...


Quantitative Evaluation of Performance and Validity Indices for Clustering the Web Navigational Sessions

Clustering techniques are widely used in “Web Usage Mining” to capture similar interests and trends among users accessing a Web site. For this purpose, web access logs generated at a particular web site are preprocessed to discover the user navigational sessions. Clustering techniques are then applied to group the user session data into user session clusters, where intercluster similarities are...


A Compressive Survey on Restructuring User Search Results by Using Feedback Session

Search engine relevance may be improved by considering the end user’s search goal, and the user’s search experience may likewise be improved by inferring individual search goals. This paper proposes a novel approach to infer user search goals by analyzing search engine query logs, known as feedback sessions. First, a framework is proposed to disco...


A Subtractive Relational Fuzzy C-Medoids Clustering Approach To Cluster Web User Sessions from Web Server Logs

In this paper, a subtractive relational fuzzy c-medoids clustering approach is discussed to identify web user session clusters from weblogs, based on their browsing behavior. In this approach, the internal arrangement of data along with the density of pairwise dissimilarity values is favored over arbitrary starting estimations of medoids as done in the conventional relational fuzzy c-medoids al...




Publication date: 2005